18 research outputs found

    Automatic Extraction and Assessment of Entities from the Web

    The search for information about entities, such as people or movies, plays an increasingly important role on the Web. This information is still scattered across many Web pages, which makes it time-consuming for a user to find all relevant information about an entity. This thesis describes techniques to extract entities and information about these entities from the Web, such as facts, opinions, questions and answers, interactive multimedia objects, and events. The findings of this thesis are that it is possible to create a large knowledge base automatically using a manually crafted ontology. After applying assessment algorithms, the precision of the extracted information was found to be between 75% and 90% (facts and entities, respectively). The algorithms from this thesis can be used to create such a knowledge base, which can in turn be used in various research fields, such as question answering, named entity recognition, and information retrieval.

    WebKnox: Web Knowledge Extraction

    This thesis focuses on entity and fact extraction from the web. Different knowledge representations and techniques for information extraction are discussed before the design of a knowledge extraction system, called WebKnox, is introduced. The main contributions of this thesis are the trust ranking of extracted facts with a self-supervised learning loop and the extraction system itself, which combines known and refined extraction algorithms. Compared to the chosen baseline approaches, the techniques used improve precision and recall in most cases for entity and fact extraction.

    Areca: Online Comparison of Research Results

    To experiment properly, scientists from many research areas need large sets of real world data. Information retrieval scientists for example often need to evaluate their algorithms on a dataset or a gold standard. The availability of these datasets often is insufficient and authors with the same goal do not evaluate their approaches on the same data. To make research results more transparent and comparable, we introduce Areca, an online portal for sharing datasets and/or the results that were reached with the author’s algorithms on these datasets. Having such an online comparison makes it easier to grasp the state-of-the-art on certain tasks and drive research to improve the results.

    An Optimized Web Feed Aggregation Approach for Generic Feed Types

    Web feeds are a popular way to access updates for content in the World Wide Web. Unfortunately, the technology behind web feeds is based on polling. Thus, clients ask the feed server regularly for updates. There are two concurrent problems with this approach. First, many times a client asks for updates, there is no new item and second, if the client’s update interval is too large it might be notified too late or even miss items. In this work we present adaptive feed polling algorithms. The algorithms learn from the previous behaviors of feeds and predict their future behaviors. To evaluate these algorithms we created a real set of over 180,000 diversified feeds and collected a dataset of their updates for a time of three weeks. We tested our adaptive algorithms on this set and show that adaptive feed polling reduces traffic significantly and provides near-real-time updates.

    Seroprevalence of Brucella Infection in Wild Boars (Sus scrofa) of Bavaria, Germany, 2019 to 2021 and Associated Genome Analysis of Five B. suis Biovar 2 Isolates

    Brucella species are highly pathogenic zoonotic agents and are found in vertebrates all over the world. To date, Germany is officially declared free from brucellosis, and continuous surveillance is currently limited to farm ruminants. However, porcine brucellosis, mostly caused by B. suis biovar 2, is still found in wild boars and hares. In the present study, a three-year monitoring program was conducted focusing on the wild boar population in the state of Bavaria. Serologic screening of 11,956 animals and a direct pathogen detection approach, including a subset of 681 tissue samples, were carried out. The serologic incidence was 17.9%, which is in approximate accordance with previously published results from various European countries. Applying comparative whole genome analysis, five isolated B. suis biovar 2 strains from Bavaria could be assigned to three known European genetic lineages. One isolate was closely related to another strain recovered in Germany in 2006. In conclusion, porcine brucellosis is endemic in Bavaria, and the wild boar population represents a reservoir for genetically distinct B. suis biovar 2 strains. However, the transmission risk of swine brucellosis to humans and farm animals is still regarded as minor due to low zoonotic potential, awareness, and biosafety measures.